Data Mining in Incomplete Numerical and Categorical Data Sets: A Neuro Fuzzy Approach
Abstract
Many applications deal with incomplete data sets and take different approaches to imputing missing values. Most address the problem only for numerical input variables. When a data set contains both numerical and categorical input variables, however, the state of the art offers no clear solution. This paper proposes a method for handling incomplete numerical and categorical data sets that extends an existing neuro-fuzzy approach. The method is extensively tested in a real environment: political election polls.
Similar resources
Automatic Clustering Subspace for High Dimensional Categorical Data Using Neuro-Fuzzy Classification
Clustering has been used extensively as a vital tool of data mining. Data gathering has been discussed widely, but most well-known clustering algorithms tend to break down in high-dimensional spaces because of the inherent sparsity of the data points. Present subspace clustering methods for handling high-dimensional data focus on numerical dimensions. The minimum spanning...
Clustering Numerical and Categorical Data
Clustering is an important technique for data mining which allows us to discover unknown relationships in our data sets. Clustering algorithms that use metrics based on the natural ordering of numbers cannot be applied to categorical (non-numerical) data. In this tutorial we will review the main methods for numerical data clustering (K-Means, Hierarchical Clustering and Fuzzy C-Means) and then s...
Construction of α-cut fuzzy X control charts based on standard deviation and range using fuzzy triangular numbers
Control charts are one of the most important tools in statistical process control; they help improve process quality and ensure required quality levels. In traditional control charts, all data should be exactly known, whereas there are many quality characteristics that cannot be expressed on a numerical scale, such as characteristics for appearance, softness, and color. Fuzzy sets theory is a...
A Neuro Fuzzy System for Knowledge Discovery of Incomplete Construction Data
This paper tackles problems encountered in mining of incomplete data for knowledge discovery of construction databases. As historical construction data are expensive to collect, any waste of incomplete data means not only loss of knowledge but also increase of costs for knowledge discovery of construction engineering. Unfortunately, incompleteness is commonplace in the existing construction dat...
On a Fuzzy c-means Algorithm for Mixed Incomplete Data Using Partial Distance and Imputation
The fuzzy c-means clustering method is normally applied to numerical data. However, most data in databases are both categorical and numerical. To date, clustering methods have been developed to analyze only complete data. Although we sometimes encounter data sets that contain one or more missing feature values (incomplete data), traditional clustering methods cannot be used for s...
Publication date: 2009